6 research outputs found

Improved neural machine translation systems for low resource correction tasks

Author: Harer Jacob Alexander
Publication date: 14/02/2020

Recent advances in Neural Machine Translation (NMT) systems have achieved impressive results on language translation tasks. However, the success of these systems has been limited when applied to similar low-resource tasks, such as language correction. In these cases, datasets are often small whilst still containing long sequences, leading to significant overfitting and poor generalization. In this thesis we study issues preventing widespread adoption of NMT systems in low-resource tasks, with a special focus on sequence correction for both code and language. We propose two novel techniques for handling these low-resource tasks. The first uses Generative Adversarial Networks to handle datasets without paired data. This technique allows the use of available unpaired datasets, which are typically much larger than paired datasets since they do not require manual annotation. We first develop a methodology for generating discrete sequences using a Wasserstein Generative Adversarial Network, and then use this methodology to train an NMT system on unpaired data. Our second technique converts sequences into a tree-structured representation and performs tree-to-tree translation. This improves the handling of very long sequences since it reduces the distance between nodes in the network, and allows the network to take advantage of information contained in the tree structure to reduce overfitting.

Boston University Institutional Repository (OpenBU)

Guidelines for Genome-Scale Analysis of Biological Rhythms

Author: Abruzzi Katherine C.
Allada Ravi
Anafi Ron
Arpat Alaaddin Bulak
Asher Gad
Baldi Pierre
Bell-Pedersen Deborah
Blau Justin
Brown Steve
Ceriani M. Fernanda
Chen Zheng
Chiu Joanna C.
Cox Juergen
Crowell Alexander M.
de Bekker Charissa
de Goede Paul
de la Iglesia Horacio O.
DeBruyne Jason P.
Dijk Derk-Jan
DiTacchio Luciano
Doyle Francis J.
Duffield Giles E.
Dunlap Jay C.
Eckel-Mahan Kristin
Esser Karyn A.
FitzGerald Garret A.
Forger Daniel B.
Francey Lauren J.
Fu Ying-Hui
Gachon Frédéric
Gatfield David
Golden Susan S.
Green Carla
Harer John
Harmer Stacey
Haspel Jeff
Hastings Michael H.
Herzel Hanspeter
Herzog Erik D.
Hoffmann Christy
Hogenesch John B.
Hong Christian
Hughes Michael E.
Hughey Jacob J.
Hurley Jennifer M.
Johnson Carl
Kay Steve A.
Koike Nobuya
Kornacker Karl
Kramer Achim
Lamia Katja
Leise Tanya
Lewis Scott A.
Li Jiajia
Li Xiaodong
Liu Andrew C.
Loros Jennifer J.
Martino Tami A.
Menet Jerome S.
Merrow Martha
Millar Andrew J.
Mockler Todd
Naef Felix
Nagoshi Emi
Nitabach Michael N.
Nusinow Dmitri A.
Olmedo Maria
Ptáček Louis J.
Rand David
Reddy Akhilesh B.
Robles Maria S.
Roenneberg Till
Rosbash Michael
Ruben Marc D.
Rund Samuel S.C.
Sancar Aziz
Sassone-Corsi Paolo
Sehgal Amita
Sherrill-Mix Scott
Skene Debra J.
Storch Kai-Florian
Takahashi Joseph S.
Ueda Hiroki R.
Wang Han
Weitz Charles
Westermark Pål O.
Wijnen Herman
Wu Gang
Xu Ying
Yoo Seung-Hee
Young Michael
Zhang Eric Erquan
Zielinski Tomasz
Publication date: 01/01/2017

Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding “big data” that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them.

Carolina Digital Repository

Guidelines for Genome-Scale Analysis of Biological Rhythms

Author: Achim Kramer
Akhilesh B. Reddy
Alaaddin Bulak Arpat
Alexander M. Crowell
Amita Sehgal
Andrew C. Liu
Andrew J. Millar
Aziz Sancar
Carl Johnson
Carla Green
Charissa de Bekker
Charles Weitz
Christian Hong
Christy Hoffmann
Daniel B. Forger
David Gatfield
David Rand
Deborah Bell-Pedersen
Debra J. Skene
Derk-Jan Dijk
Dmitri A. Nusinow
Emi Nagoshi
Eric Erquan Zhang
Erik D. Herzog
Felix Naef
Francis J. Doyle
Frédéric Gachon
Gad Asher
Gang Wu
Garret A. FitzGerald
Giles E. Duffield
Han Wang
Hanspeter Herzel
Herman Wijnen
Hiroki R. Ueda
Horacio O. de la Iglesia
Jacob J. Hughey
Jason P. DeBruyne
Jay C. Dunlap
Jeff Haspel
Jennifer J. Loros
Jennifer M. Hurley
Jerome S. Menet
Jiajia Li
Joanna C. Chiu
John B. Hogenesch
John Harer
Joseph S. Takahashi
Juergen Cox
Justin Blau
Kai-Florian Storch
Karl Kornacker
Karyn A. Esser
Katherine C. Abruzzi
Katja Lamia
Kristin Eckel-Mahan
Lauren J. Francey
Louis J. Ptáček
Luciano DiTacchio
M. Fernanda Ceriani
Marc D. Ruben
Maria Olmedo
Maria S. Robles
Martha Merrow
Michael E. Hughes
Michael H. Hastings
Michael N. Nitabach
Michael Rosbash
Michael Young
Nobuya Koike
Paolo Sassone-Corsi
Paul de Goede
Pierre Baldi
Pål O. Westermark
Ravi Allada
Ren Y
Ron Anafi
Samuel S.C. Rund
Scott A. Lewis
Scott Sherrill-Mix
Seung-Hee Yoo
Stacey Harmer
Steve A. Kay
Steve Brown
Susan S. Golden
Tami A. Martino
Tanya Leise
Till Roenneberg
Todd Mockler
Tomasz Zielinski
Xiaodong Li
Ying Xu
Ying-Hui Fu
Zheng Chen
Publication venue: 'SAGE Publications'
Publication date: 01/01/2017

Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding ‘big data’ that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them.

Southampton (e-Prints Soton)

Serveur académique lausannois

Open Access LMU

Edinburgh Research Explorer

Carolina Digital Repository

Warwick Research Archives Portal Repository

Surrey Research Insight

MPG.PuRe

Infoscience - École polytechnique fédérale de Lausanne

Crossref

CONICET Digital

Harvard University - DASH

eScholarship - University of California

ZORA

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

Deep Learning for Source Code Modeling and Generation

Author: Aebersold Simon
Allamanis Miltiadis
Allamanis Miltos
Arjovsky Martin
Ba Jimmy
Balog Matej
Bengio Yoshua
Bielik Pavol
Bielik Pavol
Blunsom Phil
Bornschein Jörg
Charles Brown Neil Christopher
Chen Zimin
Chorowski Jan K.
Chung Junyoung
Cohn Trevor
Csáji Balázs Csanád
Dai Zihang
Devlin Jacob
Dong Li
Finn Chelsea
Gage Philip
Gal Yarin
Ganin Yaroslav
Gaunt Alexander L.
Goodfellow Ian
Grave Edouard
Grefenstette Edward
Gregor Karol
Gupta Anshul
Gupta Rahul
Hao Chen
Harer Jacob
Hashimoto Tatsunori B.
Hava
He Zhen
Ioffe Sergey
Jaderberg Max
Joshi Aravind
Joulin Armand
Just René
Kaiser Łukasz
Kalchbrenner Nal
Kim Yoon
Kingma Diederik P.
Kolen John F.
Konyushkova Ksenia
Koutnik Jan
Laengle Thomas
Lam An Ngoc
Li Chengtao
Lin Chin-Yew
Liu Chang
Ma Haoyu
Maddison Chris
Mikolov Tomas
Minh Le Triet Huynh
Mnih Andriy
Mou Lili
Moura Leonardo De
Muhammad Ali Babar
Nguyen Anh Tuan
Nguyen Trong Duc
Nguyen Tung Thanh
Nguyen Tung Thanh
Papineni Kishore
Pascanu Razvan
Pennington Jeffrey
Pouyanfar Samira
Rae Jack
Rasmus Antti
Rothe Anselm
Salimans Tim
Schwenk Holger
Sodsong Wasuwee
Sukhbaatar Sainbayar
Sutskever Ilya
Sutskever Ilya
Tarvainen Antti
Tieleman Tijmen
Triet H. M. Le
Tucker George
van den Oord Aaron
Vincent
Vinyals Oriol
Waibel Alexander
Wan Li
Wan Yao
Weston Jason E.
Xiao Yan
Xu Kelvin
Zhang Xiang
Zhu Jun-Yan
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Recommended from our members

Guidelines for Genome-Scale Analysis of Biological Rhythms.

Author: Abruzzi Katherine C
Allada Ravi
Anafi Ron
Arpat Alaaddin Bulak
Asher Gad
Baldi Pierre
Bell-Pedersen Deborah
Blau Justin
Brown Steve
Ceriani M Fernanda
Chen Zheng
Chiu Joanna C
Cox Juergen
Crowell Alexander M
de Bekker Charissa
de Goede Paul
de la Iglesia Horacio O
DeBruyne Jason P
Dijk Derk-Jan
DiTacchio Luciano
Doyle Francis J
Duffield Giles E
Dunlap Jay C
Eckel-Mahan Kristin
Esser Karyn A
FitzGerald Garret A
Forger Daniel B
Francey Lauren J
Fu Ying-Hui
Gachon Frédéric
Gatfield David
Golden Susan S
Green Carla
Harer John
Harmer Stacey
Haspel Jeff
Hastings Michael H
Herzel Hanspeter
Herzog Erik D
Hoffmann Christy
Hogenesch John B
Hong Christian
Hughes Michael E
Hughey Jacob J
Hurley Jennifer M
Johnson Carl
Kay Steve A
Koike Nobuya
Kornacker Karl
Kramer Achim
Lamia Katja
Leise Tanya
Lewis Scott A
Li Jiajia
Li Xiaodong
Liu Andrew C
Loros Jennifer J
Martino Tami A
Menet Jerome S
Merrow Martha
Millar Andrew J
Mockler Todd
Naef Felix
Nagoshi Emi
Nitabach Michael N
Nusinow Dmitri A
Olmedo Maria
Ptáček Louis J
Rand David
Reddy Akhilesh B
Robles Maria S
Roenneberg Till
Rosbash Michael
Ruben Marc D
Rund Samuel SC
Sancar Aziz
Sassone-Corsi Paolo
Sehgal Amita
Sherrill-Mix Scott
Skene Debra J
Storch Kai-Florian
Takahashi Joseph S
Ueda Hiroki R
Wang Han
Weitz Charles
Westermark Pål O
Wijnen Herman
Wu Gang
Xu Ying
Yoo Seung-Hee
Young Michael
Zhang Eric Erquan
Zielinski Tomasz
Publication venue: eScholarship, University of California
Publication date: 01/10/2017

Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding "big data" that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them.

eScholarship - University of California

Guidelines for Genome-Scale Analysis of Biological Rhythms

Author: Abruzzi Katherine C.
Allada Ravi
Anafi Ron
Arpat Alaaddin Bulak
Asher Gad
Baldi Pierre
Bell-Pedersen Deborah
Blau Justin
Brown Steve
Ceriani M. Fernanda
Chen Zheng
Chiu Joanna C.
Cox Juergen
Crowell Alexander M.
de Bekker Charissa
de Goede Paul
de la Iglesia Horacio O.
DeBruyne Jason P.
Dijk Derk-Jan
DiTacchio Luciano
Doyle Francis J.
Duffield Giles E.
Dunlap Jay C.
Eckel-Mahan Kristin
Esser Karyn A.
FitzGerald Garret A.
Forger Daniel B.
Francey Lauren J.
Fu Ying-Hui
Gachon Frédéric
Gatfield David
Golden Susan S.
Green Carla
Harer John
Harmer Stacey
Haspel Jeff
Hastings Michael H.
Herzel Hanspeter
Herzog Erik D.
Hoffmann Christy
Hogenesch John B.
Hong Christian
Hughes Michael E.
Hughey Jacob J.
Hurley Jennifer M.
Johnson Carl
Kay Steve A.
Koike Nobuya
Kornacker Karl
Kramer Achim
Lamia Katja
Leise Tanya
Lewis Scott A.
Li Jiajia
Li Xiaodong
Liu Andrew C.
Loros Jennifer J.
Martino Tami A.
Menet Jerome S.
Merrow Martha
Millar Andrew J.
Mockler Todd
Naef Felix
Nagoshi Emi
Nitabach Michael N.
Nusinow Dmitri A.
Olmedo Maria
Ptáček Louis J.
Rand David
Reddy Akhilesh B.
Robles Maria S.
Roenneberg Till
Rosbash Michael
Ruben Marc D.
Rund Samuel S.C.
Sancar Aziz
Sassone-Corsi Paolo
Sehgal Amita
Sherrill-Mix Scott
Skene Debra J.
Storch Kai-Florian
Takahashi Joseph S.
Ueda Hiroki R.
Wang Han
Weitz Charles
Westermark Pål O.
Wijnen Herman
Wu Gang
Xu Ying
Yoo Seung-Hee
Young Michael
Zhang Eric Erquan
Zielinski Tomasz
Publication venue: 'SAGE Publications'
Publication date: 16/01/2019

University of Surrey